ABSTRACT


Traditionally, clustering algorithms focus on partitioning the 
data into groups of similar instances. The similarity objective,
however, is not sufficient in applications where a fair
representation of the groups in terms of protected attributes
like gender or race is required for each cluster. Moreover,
in many applications, to make the clusters useful for the 
end-user, a balanced cardinality among the clusters is re- 
quired. Our motivation comes from the education domain 
where studies indicate that students might learn better in 
diverse student groups and of course groups of similar car- 
dinality are more practical e.g., for group assignments. To 
this end, we introduce the fair-capacitated clustering prob- 
lem that partitions the data into clusters of similar instances 
while ensuring cluster fairness and balancing cluster cardi- 
nalities. We propose a two-step solution to the problem: i) 
we rely on fairlets to generate minimal sets that satisfy the 
fair constraint and ii) we propose two approaches, namely 
hierarchical clustering and partitioning-based clustering, to 
obtain the fair-capacitated clustering. Our experiments on 
three educational datasets show that our approaches deliver 
well-balanced clusters in terms of both fairness and cardi- 
nality while maintaining a good clustering quality. 


Keywords 
fair-capacitated clustering, fair clustering, capacitated clus- 
tering, fairness, learning analytics, fairlets, knapsack. 


1. INTRODUCTION 


Machine learning (ML) plays a crucial role in decision-making 
in almost all areas of our lives, including areas of high soci- 
etal impact, like healthcare and education. Our work’s mo- 
tivation comes from the education domain where ML-based 
decision-making has been used in a wide variety of tasks from 
student dropout prediction [9], forecasting on-time graduation
of students [15] to education admission decisions [21].
Recently, the issue of bias and discrimination in ML-based 
decision-making systems is receiving a lot of attention [28] 
as there are many recorded incidents of discrimination (e.g., 
recidivism prediction [20], grades prediction [4, 14]) caused 
by such systems against individuals or groups of people on
the basis of protected attributes like gender, race etc. Bias 
in education is not a new problem. There is already a long 
literature on different sources of bias in education [24] or stu- 
dents’ data analysis [3] as well as studies on racial bias [31] 
and gender bias [22]. However, ML-based decision-making 
systems have the potential to amplify prevalent biases or cre- 
ate new ones and therefore, fairness-aware ML approaches 
are required also for the educational domain. 


In this work, we focus on fairness in clustering, as in edu- 
cational activities, group assignments [8] and student team 
achievement divisions [30] are important tools that help students
work together towards shared learning goals. Clustering
is an effective solution for partitioning students into
groups of similar instances [3, 26]. Traditional algorithms,
however, focus solely on the similarity objective and do not
consider the fairness of the resulting clusters w.r.t. protected
attributes like gender. Yet, studies indicate that
students might learn better in diverse groups, e.g., mixed- 
gender groups [11, 32]. Lately, fair-clustering solutions have 
been proposed [1, 2, 5, 6], which aim to discover clusters with 
a fair representation regarding some protected attributes. In 
this work, we adopt the cluster fairness of [6], called clus- 
ter balance, according to which protected groups must have 
approximately equal representation in every cluster. 


In a teaching situation, it is obvious that the size of the 
groups should be comparable to allow a fair allocation of 
work among the students. As traditional clustering algo- 
rithms do not consider this requirement, clusters of varying 
sizes might be extracted, reducing the usefulness and ap- 
plicability of the partitioning for end-users/teachers. This 
leads to the demand for clustering solutions that also take 
into account the size of the clusters. The problem is known 
as the capacitated clustering problem (CCP) [25], which aims to
extract clusters with a limited capacity while minimizing 
the total dissimilarity in the clusters. Capacitated cluster- 
ing is useful in quite a few applications such as transfer- 
ring goods/services from the service providers (post office, 
stores, etc.), garbage collection and sales force territorial
design [27]. To the best of our knowledge, no solution exists
that considers both fairness and capacity of clusters on
top of the similarity objective.


To this end, we propose a new problem, the so-called fair- 
capacitated clustering that ensures fairness and balanced 
cardinalities of the resulting clusters. We decompose the 
problem into two subproblems: i) the fairness-requirement 
compliance step that preserves fairness at a minimum thresh- 
old of balance score and ii) the capacity-requirement com- 
pliance step that ensures clusters of comparable sizes. For 
the first step, we generate fairlets [6], which are minimal sets 
that satisfy fair representation w.r.t. a protected attribute. 
For the second step, we propose two solutions for two differ- 
ent clustering types, namely hierarchical and partitioning- 
based clustering, that consider the capacity constraint dur- 
ing the merge step (hierarchical approach) /assignment step 
(partitioning approach). Experimental results, on three real 
datasets from the education domain, show that our methods 
result in fair and capacitated clusters while preserving the 
clustering quality. 


2. RELATED WORK 


Chierichetti et al. [6] introduced the fair clustering problem 
with the aim to ensure equal representation for each pro- 
tected attribute, such as gender, in every cluster. In their 
formulation, each instance is assigned one of two colors
(red, blue). They proposed a two-phase approach: clus- 
tering all instances into fairlets - small clusters preserving 
the fairness measure, and then applying vanilla clustering 
methods on those fairlets. Subsequent studies focus on gen- 
eralization and scalability. Backurs et al. [1] presented an 
approximate fairlet decomposition algorithm which can formulate
the fairlets in nearly linear time, thus tackling the efficiency
bottleneck of [6]. Rösner and Schmidt [29] generalized
the fair clustering problem to more than two protected at- 
tributes. A more generalized and tunable notion of fairness 
for clustering was introduced in Bera et al. [2]. Anshuman 
and Prasant [5] introduced a fair hierarchical agglomerative 
clustering method for multiple protected attributes. 


The capacitated clustering problem (CCP), a combinatorial 
optimization problem, was first introduced by Mulvey and 
Beck [25] who proposed solutions using heuristic and sub- 
gradient algorithms. Several approaches exist to improve 
the efficiency of solutions or CCP approaches for different 
cluster types. Khuller and Sussmann [17], for example, in- 
troduced an approximation algorithm for the capacitated 
k-Center problem. Geetha et al. [10] improved k-Means 
algorithm for CCP by using a priority measure to assign 
points to their centroid. Lam and Mittenthal [19] proposed 
a heuristic hierarchical clustering method for CCP to solve 
the multi-depot location-routing problem. 


In this work, we introduce the problem of fair-capacitated 
clustering which builds upon the notions from fair cluster- 
ing and capacitated clustering. In particular, we build upon 
the notion of fairlets [6] to extract the minimal sets that 
preserve fairness. Regarding the CCP we follow the formu- 
lation of [25] to ensure balanced cluster cardinalities. To the 
best of our knowledge, the combined problem has not been 
studied before and as already discussed, comprises a useful 
tool in quite a few domains like education. 


3. PROBLEM DEFINITION 

Let $X \in \mathbb{R}^n$ be a set of instances to be clustered and let
$d(\cdot, \cdot) : X \times X \to \mathbb{R}$ be the distance function. For an integer $k$
we use $[k]$ to denote the set $\{1, 2, \ldots, k\}$. A $k$-clustering $\mathcal{C}$ is
a partition of $X$ into $k$ disjoint subsets, $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$,
called clusters, with $S = \{s_1, s_2, \ldots, s_k\}$ being the corresponding
cluster centers. The goal of clustering is to find an assignment
$\phi : X \to [k]$ that minimizes the objective function:


$$\mathcal{L}(X, \mathcal{C}) = \sum_{s_i \in S} \sum_{x \in C_i} d(x, s_i) \tag{1}$$


As shown in Eq. 1, the goal is to find an assignment that 
minimizes the sum of distances between each point $x \in X$
and its corresponding cluster center $s_i \in S$. It is clear
that such an assignment optimizes the similarity but does 
not consider fairness or capacity of the resulting clusters. 
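As a small illustration of the objective in Eq. 1, the following sketch sums the distance between each point and the center of its assigned cluster. The function name `clustering_cost`, the toy points and the use of Euclidean distance are assumptions made for this example only.

```python
import math

def clustering_cost(points, centers, assignment):
    """Sum of distances between each point and its assigned center (Eq. 1).

    points: list of coordinate tuples
    centers: list of coordinate tuples, one per cluster
    assignment: assignment[i] is the cluster index of points[i]
    """
    return sum(math.dist(x, centers[c]) for x, c in zip(points, assignment))

# Toy data (illustrative only): two clusters on a plane.
points = [(0.0, 0.0), (0.0, 3.0), (10.0, 0.0), (10.0, 4.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
assignment = [0, 0, 1, 1]
cost = clustering_cost(points, centers, assignment)
```

Note that nothing in this cost refers to group membership or to cluster sizes, which is why fairness and capacity have to be added as explicit constraints.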


Capacitated clustering: The goal of capacitated clustering
[25] is to discover clusters of given capacities while still
minimizing the distance objective $\mathcal{L}(X, \mathcal{C})$. The capacity
constraint is defined as an upper bound $Q_i$ on the cardinality
of each cluster $C_i$:


$$|C_i| \le Q_i \tag{2}$$


Clustering fairness: We assume the existence of a binary
protected attribute $P = \{0, 1\}$, e.g., gender = {male, female}.
Let $\psi : X \to P$ denote the demographic group to which a
point belongs, i.e., male or female.


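Fairness of a cluster is evaluated in terms of the balance
score [6], defined as the minimum ratio between the two groups:

$$\text{balance}(C_i) = \min\left( \frac{|\{x \in C_i \mid \psi(x) = 0\}|}{|\{x \in C_i \mid \psi(x) = 1\}|}, \frac{|\{x \in C_i \mid \psi(x) = 1\}|}{|\{x \in C_i \mid \psi(x) = 0\}|} \right) \tag{3}$$

Fairness of a clustering $\mathcal{C}$ equals the balance of the least
balanced cluster $C_i \in \mathcal{C}$:

$$\text{balance}(\mathcal{C}) = \min_{C_i \in \mathcal{C}} \text{balance}(C_i) \tag{4}$$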
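The balance score of Eq. 3 and Eq. 4 can be computed directly from the protected-group labels of each cluster. The sketch below is illustrative: the function names are hypothetical, and returning 0 for a cluster that lacks one of the two groups is an assumed convention that the text above does not spell out.

```python
def balance(group_labels):
    """Balance of a single cluster (Eq. 3); group_labels is a list of 0/1 labels."""
    ones = sum(group_labels)
    zeros = len(group_labels) - ones
    if zeros == 0 or ones == 0:
        return 0.0  # assumption: a cluster missing one group is fully unbalanced
    return min(zeros / ones, ones / zeros)

def clustering_balance(clusters):
    """Balance of a clustering: the balance of its least balanced cluster (Eq. 4)."""
    return min(balance(labels) for labels in clusters)

# Toy clustering (illustrative only): each inner list holds the labels of one cluster.
clusters = [[0, 0, 1], [0, 1, 1, 1], [0, 1]]
overall = clustering_balance(clusters)
```

Here the second cluster, with one member of group 0 and three of group 1, determines the overall score.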


We now introduce the problem of fair-capacitated cluster- 
ing that combines all aforementioned objectives regarding 
distance, fairness and capacity. 


Definition 1. (Fair-capacitated clustering problem) 

We define the problem of $(t, k, q)$-fair-capacitated clustering
as finding a clustering $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ that partitions
the data $X$ into $|\mathcal{C}| = k$ clusters such that the cardinality
of each cluster $C_i \in \mathcal{C}$ does not exceed a threshold $q$, i.e.,
$|C_i| \le q$ (the capacity constraint), the balance of each cluster
is at least $t$, i.e., $\text{balance}(\mathcal{C}) \ge t$ (the fairness constraint),
and the objective function $\mathcal{L}(X, \mathcal{C})$ is minimized. Parameters
$k$, $t$, $q$ are user-defined and refer to the number of clusters,
the minimum balance threshold and the maximum cluster capacity,
respectively.
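The two constraints of Definition 1 can be verified for a candidate clustering as sketched below. The function `is_fair_capacitated` and its input encoding (one list of 0/1 protected-group labels per cluster) are hypothetical, and rejecting empty clusters is an added assumption; the sketch checks feasibility only and does not address the minimization of the objective.

```python
def is_fair_capacitated(clusters, t, k, q):
    """Check the constraints of a (t, k, q)-fair-capacitated clustering.

    clusters: list of clusters, each given as a list of 0/1 group labels.
    """
    if len(clusters) != k:
        return False
    for labels in clusters:
        if not labels:
            return False  # assumption: empty clusters are not allowed
        if len(labels) > q:
            return False  # capacity constraint |C_i| <= q violated
        ones = sum(labels)
        zeros = len(labels) - ones
        # balance(C_i) >= t, written without division so that a missing
        # group (balance 0) is handled without a special case.
        if min(zeros, ones) < t * max(zeros, ones):
            return False
    return True

candidate = [[0, 1, 1], [0, 0, 1]]
feasible = is_fair_capacitated(candidate, t=0.5, k=2, q=3)
```

Tightening either threshold on the same candidate, e.g. `q=2` or `t=0.6`, makes the check fail.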


4. FAIR-CAPACITATED CLUSTERING 


4.1 Fairlet decomposition 

Traditionally, the vanilla versions of clustering algorithms 
are not capable of ensuring fairness because they assign the 
data points to the closest center without the fairness con- 
sideration. Hence, if we could divide the original data set 
into subsets such that each of them satisfies the balance 
threshold t then grouping these subsets to generate the final 
clustering would still preserve the fairness constraint. Each
fair subset is defined as a fairlet. We follow the definition of 
fairlet decomposition by [6]. 


Definition 2. (Fairlet decomposition) 
Suppose that $\text{balance}(X) \ge t$ with $t = f/m$ for some integers
$1 \le f \le m$ such that the greatest common divisor
$\gcd(f, m) = 1$. A decomposition $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$ of $X$
is a fairlet decomposition if: i) each point $x \in X$ belongs to
exactly one fairlet $F_j \in \mathcal{F}$; ii) $|F_j| \le f + m$ for each $F_j \in \mathcal{F}$,
i.e., the size of each fairlet is small; and iii) for each $F_j \in \mathcal{F}$,
$\text{balance}(F_j) \ge t$, i.e., the balance of each fairlet satisfies the
threshold $t$. Each $F_j$ is called a fairlet.


By applying fairlet decomposition on the original dataset
$X$, we obtain a set of fairlets $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$. For
each fairlet $F_j$ we randomly select a point $r_j \in F_j$ as the
center. For a point $x \in X$, we denote by $\gamma : X \to [1, l]$
the index of the fairlet to which $x$ is mapped. The second step is
to cluster the set of fairlets $\mathcal{F}$ into $k$ final clusters, subject
to the cardinality constraint. The clustering process
is described below for the hierarchical clustering type (Section 4.2)
and for the partitioning-based clustering type (Section 4.3).
Clustering results in an assignment from fairlets
to final clusters: $\delta : \mathcal{F} \to [k]$. The final fair-capacitated
clustering $\mathcal{C}$ can be determined by the overall assignment
function $\phi(x) = \delta(F_{\gamma(x)})$, where $\gamma(x)$ returns the index of
the fairlet to which $x$ is mapped.
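The composition of the two mappings described above (point to fairlet, then fairlet to final cluster) can be sketched as follows. The variable names and the toy data are hypothetical; points and fairlets are identified by integer indices for simplicity.

```python
def overall_assignment(fairlets, cluster_of_fairlet, n_points):
    """Map every point to its final cluster via the fairlet that contains it.

    fairlets: fairlets[j] lists the indices of the points in fairlet j
    cluster_of_fairlet: cluster_of_fairlet[j] is the final cluster of fairlet j
    """
    fairlet_of_point = [None] * n_points
    for j, members in enumerate(fairlets):
        for x in members:
            fairlet_of_point[x] = j
    return [cluster_of_fairlet[fairlet_of_point[x]] for x in range(n_points)]

# Toy example (illustrative only): six points in three fairlets, two final clusters.
fairlets = [[0, 3], [1, 4], [2, 5]]
cluster_of_fairlet = [0, 1, 0]
point_clusters = overall_assignment(fairlets, cluster_of_fairlet, n_points=6)
```

Because whole fairlets are assigned, the points of a fairlet never end up in different final clusters.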


4.2 Fair-capacitated hierarchical clustering 
Given the set of fairlets $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$, let
$W = \{w_1, w_2, \ldots, w_l\}$ be their corresponding weights, where the
weight $w_j$ of a fairlet $F_j$ is defined as its cardinality, i.e.,
the number of points in $F_j$.


Traditional agglomerative clustering approaches merge the 
two closest clusters, so rely solely on similarity. We extend 
the merge step by also ensuring that merging does not vio- 
late the cardinality constraint w.r.t. the cardinality thresh- 
old q. 


THEOREM 1. The balance score of a cluster formed by the
union of two or more fairlets is at least $t$:

$$\text{balance}(Y) \ge t, \quad \text{where } Y = \bigcup_{j \le l} F_j \text{ and } \text{balance}(F_j) \ge t$$


PROOF. We use the method of induction to derive the
proof. Assume we have a set of fairlets $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$
in which $\text{balance}(F_j) \ge t$, $j = 1, \ldots, l$. We first consider
the case of any two fairlets $F_1, F_2 \in \mathcal{F}$. We have
$\text{balance}(F_1) = \frac{f_1}{m_1} \ge t$ and $\text{balance}(F_2) = \frac{f_2}{m_2} \ge t$. We
denote by $Y$ the union of the two fairlets $F_1$ and $F_2$; then

$$\text{balance}(Y) = \text{balance}(F_1 \cup F_2) = \frac{f_1 + f_2}{m_1 + m_2} \tag{5}$$

It holds:

$$\begin{aligned}
&\frac{f_1}{m_1} \ge t \quad \text{or,} \quad \frac{f_1}{m_1 + m_2} \ge \frac{t\, m_1}{m_1 + m_2} \\
&\text{Similarly,} \quad \frac{f_2}{m_1 + m_2} \ge \frac{t\, m_2}{m_1 + m_2} \\
&\frac{f_1}{m_1 + m_2} + \frac{f_2}{m_1 + m_2} \ge \frac{t\, m_1}{m_1 + m_2} + \frac{t\, m_2}{m_1 + m_2} \\
&\frac{f_1 + f_2}{m_1 + m_2} \ge \frac{t (m_1 + m_2)}{m_1 + m_2} = t
\end{aligned} \tag{6}$$

Therefore, from Eq. 5 and Eq. 6 we get

$$\text{balance}(Y) \ge t \tag{7}$$

Thus, the statement given in Theorem 1 is true for any cluster
formed by the union of any two fairlets. Now we assume
that the statement holds true for a cluster formed from $i$
fairlets, i.e., $Y = \bigcup_{j \le i} F_j$, where $1 < i < l$. Then,

$$\text{balance}(Y) = \frac{\sum_{j \le i} f_j}{\sum_{j \le i} m_j} \ge t \tag{8}$$

Consider another fairlet $F_{i+1} \in \mathcal{F}$ which is not in the formed
cluster $Y$, with $\text{balance}(F_{i+1}) = \frac{f_{i+1}}{m_{i+1}} \ge t$. Then, by joining $F_{i+1}$
with the cluster $Y$ we get the new cluster $Y'$ such that

$$\text{balance}(Y') = \frac{f_{i+1} + \sum_{j \le i} f_j}{m_{i+1} + \sum_{j \le i} m_j} \tag{9}$$

Following the steps in Eq. 6, we can similarly show that

$$\frac{f_{i+1} + \sum_{j \le i} f_j}{m_{i+1} + \sum_{j \le i} m_j} \ge t \implies \text{balance}(Y') \ge t \tag{10}$$

Hence, the theorem holds true for a cluster formed with $i + 1$
fairlets if it is true for $i$ fairlets. Since $i$ is an arbitrary number
of fairlets, the theorem holds true for all cases.


Theorem 1 shows that for any cluster formed by the union of
fairlets, the fairness constraint is always preserved. Hence,
we do not need further interventions w.r.t. fairness.
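The statement of Theorem 1 can also be checked numerically on small cases. The sketch below is a sanity check, not a proof: it enumerates small fair sets (standing in for fairlets) by their two group counts, merges every pair and verifies that the union still meets the threshold. The threshold $t = 1/2$, the count range and all names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def balance(zeros, ones):
    """Balance computed from the two group counts (Eq. 3); 0 if a group is missing."""
    if zeros == 0 or ones == 0:
        return Fraction(0)
    return min(Fraction(zeros, ones), Fraction(ones, zeros))

t = Fraction(1, 2)

# All (zeros, ones) count pairs with 1..3 members per group that satisfy the threshold.
fair_sets = [(z, o) for z, o in product(range(1, 4), repeat=2) if balance(z, o) >= t]

# Merging two sets adds their group counts; the union should remain fair.
all_hold = all(
    balance(z1 + z2, o1 + o2) >= t
    for (z1, o1), (z2, o2) in product(fair_sets, repeat=2)
)
```

Exact rational arithmetic via `Fraction` avoids spurious failures from floating-point rounding at the threshold.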


The pseudocode is shown in Algorithm 2 of Appendix B.
In each step, the closest pair of clusters is identified and a
new cluster is created only if its size does not exceed
the capacity threshold $q$. Otherwise, the next closest pair
is investigated. The procedure continues until $k$ clusters
remain. The remaining clusters are fair and capacitated according
to the corresponding thresholds $t$ and $q$. To compute
the proximity matrix (line 1 and line 8), we use the distance
between centroids of the corresponding clusters. The function
capacity(cluster) in line 5 returns the size of a cluster.
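The merge procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: fairlets are given as lists of coordinate tuples, centroid distances are recomputed in every round instead of maintaining a proximity matrix, and raising an error when no admissible pair exists is an added design choice. All names are hypothetical.

```python
import math
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def hierarchical_fair_capacitated(fairlets, k, q):
    """Merge fairlets until k clusters remain, never exceeding capacity q."""
    clusters = [list(f) for f in fairlets]  # each fairlet starts as its own cluster
    while len(clusters) > k:
        # Candidate pairs ordered by centroid distance, closest first.
        pairs = sorted(
            combinations(range(len(clusters)), 2),
            key=lambda ab: math.dist(centroid(clusters[ab[0]]),
                                     centroid(clusters[ab[1]])),
        )
        for a, b in pairs:
            if len(clusters[a]) + len(clusters[b]) <= q:
                merged = clusters[a] + clusters[b]
                clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
                clusters.append(merged)
                break  # re-evaluate distances after every merge
        else:
            raise ValueError("no pair can be merged without exceeding capacity q")
    return clusters

# Toy fairlets (illustrative only): four fairlets of two points each.
toy_fairlets = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 1.0)],
    [(10.0, 0.0), (10.0, 1.0)],
    [(11.0, 0.0), (11.0, 1.0)],
]
result = hierarchical_fair_capacitated(toy_fairlets, k=2, q=4)
```

Since only whole fairlets are merged, the fairness guarantee of Theorem 1 carries over and the loop only needs to watch the capacity.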


4.3 Fair-capacitated partitioning clustering 
Partitioning-based clustering algorithms, such as k-Medoids, 
can be viewed as a distance minimization problem, in which, 
we try to minimize the objective function in Eq. 1. The 
vanilla k-Medoids does not satisfy the cardinality constraint 
since the step that allocates points to clusters is based only on
the distances among them. Now, if we change the goal of this
assignment step to find the “best” data points with a defined 
capacity for each medoid instead of searching for the most 
suitable medoid for each point, we can control the cardinality 
of clusters. We formulate the problem of assigning points 
to clusters subject to a capacity threshold q as a knapsack 
problem [23]. 


Let $S = \{s_1, s_2, \ldots, s_k\}$ be the cluster centers, i.e., medoids,
and $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$ be the resulting clusters. We change
the assignments of points to clusters, using knapsack, in order
to meet the capacity constraint $q$. In particular, we
define a flag variable $y_j = 1$ if $x_j$ is assigned to cluster $C_i$,
otherwise $y_j = 0$. Now, we assign a value $v_j$ to each data point
$x_j$, which depends on the distance of $x_j$ to $C_i$, with $v_j$ being
maximum if $C_i$ is the best cluster for $x_j$, i.e., the distance
between $x_j$ and $s_i$ is minimum. We formulate the value $v_j$ of
instance $x_j$ based on an exponential decay distance function:

$$v_j = e^{-\frac{1}{\lambda} \cdot d(x_j, s_i)} \tag{11}$$

where $d(x_j, s_i)$ is the Euclidean distance between the point
$x_j$ and the medoid $s_i$. The higher $\lambda$ is, the lower the effect
of distance on the value of the points. The point which is
closer to the medoid will have a higher value.


Then, the objective function for the assignment step is:

$$\text{maximize} \sum_{j=1}^{n} v_j y_j \tag{12}$$


Now, let $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$ and $W = \{w_1, w_2, \ldots, w_l\}$
be the set of fairlets and their corresponding weights, i.e.,
the number of instances in each fairlet, respectively, and let $q$ be the
maximum capacity of the final clusters. Our target is to
cluster the set of fairlets $\mathcal{F}$ into $k$ clusters centered by $k$
medoids. We apply the formulas in Eq. 11 and Eq. 12 on the
set of fairlets $\mathcal{F}$, i.e., each fairlet $F_j$ has the same role as $x_j$.
Then, the problem of assigning the fairlets to each medoid in
the cluster assignment step becomes finding a set of fairlets
whose total weight is less than or equal to $q$ and whose total value
is maximized. In other words, we can formulate the cluster
assignment step in the partitioning-based clustering as a 0-1
knapsack problem:


$$\text{maximize} \sum_{j=1}^{l} v_j y_j \quad \text{subject to} \quad \sum_{j=1}^{l} w_j y_j \le q \ \text{ and } \ y_j \in \{0, 1\} \tag{13}$$


Here, $y_j$ is the flag variable for $F_j$, with $y_j = 1$ if $F_j$ is assigned
to a cluster and $y_j = 0$ otherwise; $v_j$ is the value of $F_j$, which is
computed by Eq. 11; $q$ is the desired maximum capacity.


The pseudocode of our k-Medoids fair-capacitated algorithm
is described in Algorithm 1. For each medoid, we
search for the adequate points (line 3) by using the function
knapsack(p, values, w, q) (line 10), implemented using
dynamic programming, which returns a list of items with a
maximum total value and a total weight not exceeding q.
In the main function, line 12, we optimize the clustering cost
by replacing medoids with non-medoid instances when the
clustering cost is decreased. This optimization procedure
stops when no improvement in the clustering
cost is found (lines 19 to 32).
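The assignment step for a single medoid can be sketched as a textbook 0-1 knapsack solved by dynamic programming. This is a minimal illustration under stated assumptions: the exponential-decay value follows the description of Eq. 11 (larger decay parameter, weaker effect of distance), the default of 0.3 is the value reported in the experiments, and the function names and signatures are hypothetical rather than those of the authors' code.

```python
import math

def value(distance, lam=0.3):
    """Exponential-decay value of a fairlet at a given distance from the medoid."""
    return math.exp(-distance / lam)

def knapsack(values, weights, q):
    """0-1 knapsack by dynamic programming; returns the indices of chosen items."""
    n = len(values)
    # best[j][cap]: maximum total value using the first j items within capacity cap
    best = [[0.0] * (q + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        w, v = weights[j - 1], values[j - 1]
        for cap in range(q + 1):
            best[j][cap] = best[j - 1][cap]
            if w <= cap and best[j - 1][cap - w] + v > best[j][cap]:
                best[j][cap] = best[j - 1][cap - w] + v
    # Backtrack to recover which items were selected.
    chosen, cap = [], q
    for j in range(n, 0, -1):
        if best[j][cap] != best[j - 1][cap]:
            chosen.append(j - 1)
            cap -= weights[j - 1]
    return sorted(chosen)

# Toy example (illustrative only): distances of four fairlets to one medoid.
distances = [0.1, 0.2, 0.9, 0.15]
weights = [3, 2, 2, 3]          # fairlet sizes
values = [value(d) for d in distances]
selected = knapsack(values, weights, q=5)
```

In the toy example the two closest fairlets (indices 0 and 3) cannot both be taken because their combined size is 6, so the best admissible selection pairs fairlet 0 with fairlet 1.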


5. EXPERIMENTS 


In this section, we describe our experiments and the perfor- 
mance of our proposed methods on three educational datasets. 


5.1 Experimental setup 
Datasets. The datasets are summarized in Table 1. 
UCI student performance. This dataset includes the demographics,
grades, social and school-related features of students
in secondary education of two Portuguese schools [7] in
2005 - 2006. “Gender” is selected as the protected attribute,
i.e., we aim to balance gender in the resulting clusters.


Algorithm 1: k-Medoids fair-capacitated algorithm
Input: $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$: a set of fairlets
       $W = \{w_1, w_2, \ldots, w_l\}$: weights of fairlets
       q: a given maximum capacity of final clusters
       k: number of clusters
Output: A fair-capacitated clustering
Function ClusterAssignment(medoids):
    clusters ← ∅ ;
    for each medoid s in medoids do
        candidates ← all fairlets which are not assigned to any cluster ;
        p ← length(candidates) ;
        w ← weights(candidates) ;
        for each fairlet_i in candidates do
            values[i] ← v(fairlet_i) // Eq. 11 ;
        end
        clusters[s] ← knapsack(p, values, w, q) ;
    end
    return clusters ;
Function main():
    medoids ← select k of the l fairlets arbitrarily ;
    clusters ← ClusterAssignment(medoids) ;
    cost_best ← current clustering cost ;
    s_best ← null ;
    o_best ← null ;
    repeat
        for each medoid s in medoids do
            for each non-medoid o in F do
                consider the swap of s and o, compute the current clustering cost ;
                if current clustering cost < cost_best then
                    s_best ← s ;
                    o_best ← o ;
                    cost_best ← current clustering cost ;
                end
            end
        end
        update medoids by the swap of s_best and o_best ;
        clusters ← ClusterAssignment(medoids) ;
    until no improvements can be achieved by any replacement ;
    return clusters ;


Open University Learning Analytics (OULAD). This 
is the dataset from the OU Analyse project [18] of Open 
University, England in 2013 - 2014. Information of students 
includes their demographics, courses, their interactions with 
the virtual learning environment, and final outcome. We 
aim to balance the “Gender” attribute in the results. 


MOOC. The data covers students who enrolled in the 16 
edX courses offered by the two institutions (Harvard Univer- 
sity and the Massachusetts Institute of Technology) during 
2012 - 2013 [13]. The dataset contains aggregated records 
which represent students’ activities and their final grades of 
the courses. “Gender” is the protected attribute. 


Baselines. We compare against well-known fairness-aware
clustering methods and a vanilla clustering method.


Table 1: An overview of the datasets

Dataset                  #instances  #attributes  Protected attribute          Balance score
UCI student performance  649         33           Gender (F: 383; M: 266)      0.695
OULAD                    4,000       12           Gender (F: 2,000; M: 2,000)  1
MOOC                     4,000       21           Gender (F: 2,000; M: 2,000)  1


k-Medoids [16]: a traditional partitioning clustering tech- 
nique that divides the dataset into k clusters so as to mini- 
mize clustering cost. Cluster centers are actual instances. 


Vanilla fairlet [6]: a fairness-aware clustering approach that 
i) decomposes the dataset into fairlets and ii) applies a vanilla 
k-center clustering algorithm [12] to form the final k clusters. 


MCF fairlet [6]: Similar to Vanilla fairlet but the fairlet 
decomposition part is transformed into a minimum cost flow 
(MCF) problem, by which an optimized version of fairlet 
decomposition in terms of cost value is computed. 


Evaluation measures. We report on clustering quality (mea- 
sured as clustering cost, see Eq. 1), cluster fairness (ex- 
pressed as cluster balance, see Eq. 4) and cluster capacity 
(expressed as cluster cardinality). 


Parameter selection. Regarding fairness, a minimum threshold
of balance $t$ is set to 0.5 for all datasets in our experiments.
It means that, in each resulting cluster, the minority group
is at least half the size of the majority group. Regarding the
$\lambda$ factor in Eq. 11, a value $\lambda = 0.3$ is chosen for our experiments
from a range of [0.1, 1.0] via grid search. We evaluate
the clustering cost and balance score on a small dataset,
i.e., the UCI student performance dataset - Mathematics subject,
w.r.t. $\lambda$. Theoretically, the ideal capacity of clusters is
$\left\lceil \frac{|X|}{k} \right\rceil$, where $|X|$ is the population of dataset $X$ and $k$ is the
number of desired clusters. However, in many cases, the
clustering models cannot satisfy this constraint, especially
the hierarchical clustering model. Hence, we use the formula
in Eq. 14 to compute the maximum capacity $q$ of clusters; $\epsilon$ is
a parameter chosen in experiments for each fair-capacitated
clustering approach.


$$q = \left\lceil \epsilon \cdot \frac{|X|}{k} \right\rceil \tag{14}$$


To find the appropriate value of $\epsilon$, we set a range of [1.0,
1.3] to ensure all the generated clusters have members and
evaluate the cardinality of resulting clusters on the UCI student
performance (Mathematics subject) dataset. $\epsilon$ is set to
1.01 and 1.2 for the k-Medoids fair-capacitated and hierarchical
fair-capacitated methods, respectively.
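To give a feeling for the resulting thresholds, the sketch below computes a maximum capacity from the dataset size, the number of clusters and the slack parameter. The formula used here (slack times the ideal capacity, rounded up) is an assumption about the exact form of Eq. 14, and the function name and the choice of k = 5 are purely illustrative.

```python
import math

def max_capacity(n_instances, k, slack):
    """Maximum cluster capacity: the ideal size n/k scaled by a slack factor, rounded up.

    The rounding and the exact formula are assumptions made for illustration.
    """
    return math.ceil(slack * n_instances / k)

# UCI student performance has 649 instances (Table 1); k = 5 is an arbitrary choice.
q_kmedoids = max_capacity(649, 5, 1.01)      # slack reported for k-Medoids fair-capacitated
q_hierarchical = max_capacity(649, 5, 1.2)   # slack reported for hierarchical fair-capacitated
```

With these settings the partitioning-based method is allowed only a couple of instances above the ideal size of about 130, while the hierarchical method receives noticeably more room.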


5.2 Experimental results 


UCI student performance. When k is less than 4, as shown
in Figure 1-a, the clustering quality of our models can be
close to that of the vanilla k-Medoids method. However,
the clustering cost fluctuates thereafter due to the effort
to maintain fairness and cardinality.
Our vanilla fairlet hierarchical fair-capacitated outperforms
other competitors in most cases. Vanilla fairlet and MCF
fairlet show the worst clustering cost as an effect of the
k-Center method. Figure 1-b depicts the clustering fairness.
As we can observe, in terms of fairness, vanilla fairlet hierarchical
fair-capacitated has the best performance when k is
less than 10. In contrast, by selecting points for
each cluster in the cluster assignment step, the k-Medoids
fair-capacitated method can maintain the fairness well in
many cases. Regarding the cardinality, as illustrated in Figure 1-c,
our approaches outperform the competitors as
they keep the number of instances in each cluster under
the specified thresholds.


OULAD. Our MCF fairlet k-Medoids fair-capacitated ap- 
proach outperforms other methods in terms of clustering 
cost, although there is an increase compared to the vanilla 
k-Medoids algorithm, as we can see in Figure 2-a. Con- 
cerning fairness, in Figure 2-b, k-Medoids is the weakest 
method while others can achieve the highest balance. The 
balance of Gender feature in the dataset is the main reason 
for this result. All fairlets are fully fair; this is a prerequi- 
site for our methods of being able to maintain the perfect 
balance. Regarding cardinality, our approaches demonstrate 
their strength in ensuring the capacity of clusters (Figure 2- 
c). The difference in the size of the clusters generated by 
our methods is tiny. This is in stark contrast to the trend 
of competitors. 


MOOC. The results of clustering quality are described in 
Figure 3-a (Appendix A). Although an increase in the clus- 
tering cost is the main trend, our methods outperform the 
vanilla fairlet and MCF fairlets methods. Regarding clus- 
tering fairness, as depicted in Figure 3-b, our approaches 
can maintain the perfect balance for all experiments. This 
is the result of an actual balance in the dataset and the 
fairlets. Notably, our methods divide all the instances in the
experiments into capacitated clusters, as shown in
Figure 3-c, and thus compare favorably with the
competitors regarding clusters’ cardinality.


Summary of the results. In general, fairness is well maintained
in all of our experiments. When the data is fair, as in
the case of the OULAD and MOOC datasets, our methods achieve
perfect fairness. In terms of cardinality, our methods are
able to maintain the cardinality of resulting clusters within
the maximum capacity threshold, which the competing methods
do not achieve. The fair-capacitated partitioning-based
method is better than the hierarchical approach
since it can determine the capacity threshold closest
to the ideal capacity. Regarding the clustering cost, the
hierarchical approach has an advantage over other methods by
outperforming its competitors in most experiments.


6. CONCLUSION AND OUTLOOK 


In this work, we introduced the fair-capacitated clustering 
problem that extends traditional clustering, solely focusing 
on similarity, by also aiming at a balanced cardinality among 
the clusters and a fair representation of instances in each
cluster according to some protected attributes like gender 
or race. Our solutions work on the fairlets derived from 

Figure 1: Performance of different methods on UCI student performance dataset
a) Clustering quality (lower is better) b) Clustering fairness (higher is better)

Figure 2: Performance of different methods on OULAD dataset

ity. In the future, we plan to extend our approach to more
than one protected attribute as well as to further investigate
what fair group assignment means in educational settings.
APPENDIX
A. MOOC DATASET
Figure 3: Performance of different methods on MOOC dataset


B. HIERARCHICAL FAIR-CAPACITATED 
ALGORITHM 


Algorithm 2: Hierarchical fair-capacitated algorithm
Input: $\mathcal{F} = \{F_1, F_2, \ldots, F_l\}$: a set of fairlets
       q: a given maximum capacity of final clusters
       $W = \{w_1, w_2, \ldots, w_l\}$: weights of fairlets
       k: number of clusters
Output: A fair-capacitated clustering
1 compute the proximity matrix ;
2 clusters ← F // each fairlet F_j is considered as a cluster ;
3 repeat
4     cluster_1, cluster_2 ← the closest pair of clusters ;
5     if capacity(cluster_1) + capacity(cluster_2) ≤ q then
6         newcluster ← merge(cluster_1, cluster_2) ;
7         update clusters with newcluster ;
8         update the proximity matrix ;
9     else
10        continue with the next closest pair ;
11    end
12 until k clusters remain ;
13 return clusters ;